New particles at the TeV-scale may have sizeable decay rates into boosted Higgs bosons 
or other heavy scalars. Here, we investigate the possibility of identifying such processes 
when the Higgs/scalar subsequently decays into a pair of W bosons, constituting a highly 
distinctive "diboson-jet." These can appear as a simple dilepton (plus $\slashed{E}_T$) configuration, 
as a two-prong jet with an embedded lepton, or as a four-prong jet. We study jet substructure 
methods to discriminate these objects from their dominant backgrounds. We then 
demonstrate the use of these techniques in the search for a heavy spin-one Z' boson, such 
as may arise from strong dynamics or an extended gauge sector, utilizing the decay chain 
$Z' \to Zh \to Z(WW^{(*)})$. We find that modes with multiple boosted hadronic Zs and Ws 
tend to offer the best prospects for the highest accessible masses. For 100 fb$^{-1}$ luminosity 
at the 14 TeV LHC, Z' decays into a standard 125 GeV Higgs can be observed with $5\sigma$ 
significance for masses of 1.5-2.5 TeV for a range of models. For a 200 GeV Higgs (requiring 
nonstandard couplings, such as fermiophobic), the reach may improve to up to 2.5-3.0 TeV. 



I. INTRODUCTION 



The hunt for the Higgs boson is now in full swing at the LHC. Assuming Standard Model 
production and decay, the available mass range has shrunk dramatically over just the past 
year, with a favored region emerging between about 120 and 130 GeV, and hints of signals 
near 125 GeV. While this bodes well for the minimal Higgs scenario suggested by 
electroweak precision and flavor observables, the story is of course far from over. Even if 
the Higgs is ultimately found in this range, it will immediately come under very detailed 
scrutiny to verify that both its production and decay rates are indeed standard. Another 
possibility is that the Higgs, or one of its scalar cousins, is still hiding in the already excluded 
range, but with smaller rate than the Standard Model prediction. 

While we continue to wait for a concrete result either way, we anticipate that the Higgs 
boson will be merely the first of many exciting discoveries at the LHC. In fact, the Higgs 
itself will continue to play a central role in the search for new physics by serving as a signal 
of the production of new particles, much like gauge bosons and fermions currently serve as 
signals of the Higgs. Such new physics signals are of particular interest for illuminating the 
full dynamics of electroweak symmetry breaking, as new particles with large couplings to the 
Higgs may give us clues about the Higgs's origins. However, LHC searches for new sources of 
Higgs production can be nontrivial in practice, owing to the fact that Higgs decays usually 
involve jets. In particular, for new particles much heavier than the Higgs, this can lead to 
complications in standard object reconstructions - the jets and/or leptons from the Higgs 
decay can become merged due to the small angles incurred by the large transverse Lorentz 
boost, forming a single "Higgs-jet." Therefore, it behooves us to take a closer look at how 
these signals might be uncovered. 

Since the Higgs boson has still not been conclusively seen, and since we do not even know 
if its properties are strictly those predicted by the Standard Model, any phenomenological 
study of boosted Higgses faces an immediate two-pronged question: What is the Higgs's 
mass and how does it decay? In previous publications US], a subset of the present authors 
explored the possibility of a Standard Model Higgs near the LEP bound ($m_h \sim 115$ GeV), 
identified via the decays $h \to b\bar{b}$ and $h \to \tau^+\tau^-$. Utilizing the tools of jet substructure, as 
well as some novel modifications to tau-tagging algorithms, we found that these dijet and 
ditau systems could be effectively discriminated from QCD jets, even at $p_T$'s well above 
1 TeV and opening angles of O(0.1). We then applied these reconstruction techniques to the 
search for $q\bar{q} \to Z' \to Zh$, where the Z' is any new neutral spin-1 resonance with multi-TeV 
mass. We found sensitivity in a variety of different final-state channels up to Z' masses 
near 3 TeV, assuming a 14 TeV LHC and O(100 fb$^{-1}$) of luminosity. This mass is near the 
electroweak precision lower bound often cited for warped extra dimension and composite 
Higgs models js, 7], which had previously been estimated to be accessible for a light Higgs 
only after ab$^{-1}$-scale luminosity upgrades js]. 

In this paper, we extend those previous results to the much richer four-body decay modes 
$h \to WW^{(*)}/ZZ^{(*)} \to 4$ fermions, which we call "diboson-jets." We consider in detail the 
two cases of a 125 GeV Higgs decaying to $WW^*$ and a 200 GeV Higgs decaying to 
doubly-resonant $WW$. With the current limits from the LHC, the latter becomes viable only 
if this "Higgs" has nonstandard couplings (e.g., fermiophobic), which would already be a 
strong indication of additional TeV-scale physics. It can also be a neutral scalar not directly 
associated with electroweak symmetry breaking, but which nonetheless decays like a heavy 
Higgs due to mixing [9]. For example, in the case of Z' models, it may be a scalar associated 
with U(1)' breaking. Given that scalar $\to WW^{(*)}/ZZ^{(*)}$ decays can actually be rather 
generic, we expect that the techniques which we explore here will have wider applications. 

The $WW^{(*)}$ diboson-jets have the usual ensemble of secondary decay modes available to 
systems with two W-bosons. Neglecting decays with taus for simplicity, these are dileptonic 
($WW^{(*)} \to l\nu l\nu$, BR = 5%), semileptonic ($WW^{(*)} \to l\nu q\bar{q}'$, BR = 30%), and fully hadronic 
($WW^{(*)} \to q\bar{q}'q\bar{q}'$, BR = 45%). In the dileptonic case, care must be taken in defining lepton 
isolation, but otherwise their identification is straightforward [10, 11]. In the semileptonic 
and fully hadronic cases, we get configurations that look qualitatively similar to QCD jets, 
and must apply dedicated jet substructure techniques to fully reconstruct the decay 
kinematics. These techniques allow us to establish that these jet-like systems in fact contain, 
respectively, a lepton and a subjet doublet, or a quadruplet of subjets. In either case, the 
total (transverse) mass of the diboson-jet will be approximately $m_h$, and we may be able 
to pick out one or more on-shell W subsystems. This level of detailed reconstruction can 
then form the basis of dedicated semileptonic and fully hadronic diboson-jet tags, whose 
performance we describe in more detail below. 

With these substructure techniques in hand, we demonstrate their utility for new physics 
searches by returning to the example of a multi-TeV Z' decaying into Zh. Such new 
resonances are ubiquitous in models where the Higgs originates as a composite particle from 
a new strongly-interacting sector, and these Z' decays can be dominated by Zh (and the 
$SU(2)_L$-related mode $W^+W^-$). (See, e.g., ) The decay $Z' \to Zh$ also occurs with 
appreciable rate in simple U(1)'-extended gauge sectors, such as a TeV-scale hypercharge/$B-L$ 
admixture or bosons from $E_6$ unification (reviewed in [13]). 

We conduct a broad multi-channel survey of discovery prospects for $Z' \to Zh \to$ 
well-known techniques for tagging boosted hadronic Z-bosons [14-21]. We find that 
channels incorporating hadronic decay modes and jet substructure are the most powerful, with 
many different channels displaying similar performance. The most promising amongst these 
at masses above 3 TeV include the "dijet with embedded lepton" channel $Zh \to (q\bar{q})(l\nu q\bar{q}')$ 
(previously studied in js]), as well as the "monojet" channel $(\nu\bar{\nu})(q\bar{q}'q\bar{q}')$. For masses below 
about 2 TeV, the former remains powerful, and the channels $(q\bar{q})(l\nu l\nu)$, $(\nu\bar{\nu})(l\nu q\bar{q}')$, and 
$(l^+l^-)(l\nu q\bar{q}')$ also become comparable or better. Of course, in analogy to the ongoing 
Standard Model Higgs search itself, the answer to the question "which channel is best?" really 
depends on the Z' mass range of interest and the available luminosity, and statistical 
combinations across channels can offer nontrivial benefits. We perform a simple estimate of such 
a combination, and consider the discovery reach for a baseline set of Z' models. We find 
that a 100 fb$^{-1}$ run of the 14 TeV LHC has the potential to discover $Z' \to Zh \to Z(WW^{(*)})$ 
with masses of up to 1.5-2.5 TeV in the case of a 125 GeV Higgs. For the 200 GeV Higgs, 
the reach potentially increases to 2.5-3.0 TeV, though the actual reach inevitably depends 
on the mechanism used to "hide" this Higgs from LHC searches. 

The paper is organized as follows. In the next section, we discuss techniques for tagging 
boosted hadronic and semileptonic diboson-jets. In Section III, we survey our studies of Z' 
discovery prospects in a variety of different channels. We provide some closing comments in 
Section IV. Finally, we include two appendices. In Appendix A, we give a more complete 
account of the technical details of our discovery estimates. In Appendix B, we elaborate on 
our detector modeling. 
FIG. 1: Example event displays of fully hadronic (left) and semileptonic (right) diboson-jets in 
the $\eta$-$\phi$ plane. These are from 125 GeV $h \to WW^*$ decays originating from a 3 TeV Z', passed 
through our simple detector model and represented at ECAL-level spatial resolution. Identified 
subjets are color-coded according to relative $p_T$ (in descending order: red, blue, green, yellow). The 
semileptonic example includes a non-isolated muon (cyan). (Grey cells were thrown away by the 
substructure algorithm.) 



II. TAGGING DIBOSON-JETS 



In this section, we outline some techniques for tagging highly Lorentz-boosted diboson 
systems, generated in the decay of a boosted Higgs or other heavy scalar. We specialize 
to the case of $h \to WW^{(*)}$, which is the dominant diboson decay mode of the Higgs. Our 
approach can be straightforwardly extended to $ZZ^{(*)}$, or generally any decay chain that 
leads to a four-body final state with any admixture of jets and leptons. 

We first discuss the case of fully hadronic $WW^{(*)}$ decay, and then semileptonic decay. We 
assume that the fully leptonic decay can largely be dealt with using standard lepton 
reconstruction, perhaps with somewhat loosened isolation criteria (as was explored for boosted 
leptonic Z bosons in [11]). Examples of fully hadronic and semileptonic diboson-jets 
from boosted Higgs decay can be seen in Fig. 1. Note that these examples, with $p_T > 1$ TeV, 
could easily fit inside of a standard-sized LHC jet ($R \sim 0.4$).$^1$ 



1 For $p_T \sim 1$ TeV and $m_h < 200$ GeV, at least two fermions are merged at the $\Delta R < 0.4$ level in essentially 
To study these decay modes under semi-realistic conditions, we turn photons and hadrons 
into "detector level" objects in the form of calorimeter cells. The finite spatial resolution of 
the calorimeter represents an obstacle to reliable jet substructure as we approach angular 
scales of O(0.1), exacerbated by the fact that a single particle will generally deposit energy 
into a cluster of nearby cells rather than staying confined to a single cell. This tends to 
introduce spurious substructure and to obscure any substructure that is actually present, 
especially as the momentum scale exceeds a TeV. We use the present study as an opportunity 
to explore methods to undo this effect. In a typical LHC detector, the electromagnetic 
calorimeter (ECAL) captures an O(1) fraction of the total jet energy, and traces its angular 
distribution with O(0.02) resolution. This information can be combined with the hadronic 
calorimeter (HCAL), which is spatially coarser but must be included for a complete energy 
measurement. We employ such a combination in the context of a highly simplified detector 
model. We defer a detailed discussion to Appendix B, but use the results of this approach 
throughout the rest of the paper. 



A. Fully Hadronic 



The four-prong fully hadronic decay of $WW^{(*)}$ can be dealt with much in analogy to 
two-prong and three-prong decays such as from boosted W/Z and $h \to b\bar{b}$ [14-22] or boosted 
top quarks [23-26] (see also [27, 28]). Also, just as these previous techniques can 
be generalized to other new physics searches (e.g., R-parity violating supersymmetry [29], 
colored hadrons of a new confining sector [30]), a fully hadronic four-body diboson-tagger 
may have other applications. A trivial extension automatically covered by the present study 
is boosted $ZZ^{(*)}$, but we can also contemplate scenarios such as boosted Higgs decays to 
pseudoscalars $h \to aa \to 4g$ or $4b$ (also studied using substructure methods in [31, 32]), or 
any other multi-stage decay that ends in four quarks or gluons. For our purposes in this 
paper, we will simply consider the cases of a 125 GeV $h \to WW^*$ or 200 GeV $h \to WW$. 
While the rich substructure of four-prong decays should make their discrimination from 



every event. The most widely separated fermions, on the other hand, can extend out to $\Delta R \sim 1$: for the 
125 GeV (200 GeV) Higgs, $\Delta R_{\max} < 1.0$ in 90% (75%) of events. Lower $p_T$ and/or higher-mass 
diboson-jets are broader. Below, we attempt to capture most of the decay products in a single large-radius jet that 
covers about half of the detector. However, this approach ultimately becomes inefficient for $p_T < 300$ GeV 
(500 GeV). 
QCD-induced jets more straightforward than it is for two-prong or three-prong decays, we 
should also bear in mind that the energy of the parent particle has to be distributed between 
more objects. This often leads to a decay with a relatively soft parton and/or a parton that 
is somewhat widely separated in angle from the bulk of the jet activity. Consequently, while 
our first intuition may be to simply run two iterations of a two-body boosted object tagger, 
as is done for the Hopkins/CMS top-tagger, we are often faced with situations where 
the first iteration distributes the partons unequally between the first level of subjets (1+3 
instead of 2+2), or where the softest and/or widest-angle parton is accidentally thrown away. 
To implement our diboson-tagger, we instead work with a technique that borrows some 
ideas from the three-body neutralino-tagger of [29]. (Ideas along the lines of the other 
multibody substructure methods cited above would also be interesting to explore.) We 
first iteratively cluster the event into R = 1.5 quasi-hemispheric "fat-jets" with the 
Cambridge/Aachen clustering algorithm [34, 35], as implemented in FastJet [36]. After a fat-jet 
of interest is identified, we scan over its clustering history. Working backwards, this can be 
viewed as a tree-like structure consisting of a sequence of $1 \to 2$ splittings. Each splitting 
presents us with two protojet branches $j_1$ and $j_2$, with $p_T(j_1) > p_T(j_2)$ by definition. We 
characterize all of the splittings with a fractional $p_T$ measure with respect to the original 
fat-jet, $p_T(j_2)/p_T(j_{\rm fat})$, and a mass-to-$p_T$ ratio of the softer branch, $m(j_2)/p_T(j_2)$. If the former 
is smaller than a threshold $\delta_p = 0.03$, or the latter is larger than a threshold $\delta_{m/p_T} = 0.3$, we 
label the splitting as "soft."$^2$ We also automatically label "soft" all splittings downstream 
of the $j_2$ branch of a soft splitting. The remaining splittings we label "hard." For these, 
each of the two branches carries an appreciable fraction of the jet $p_T$, the softer branch is 
not diffuse, and the splitting does not occur within a soft/diffuse branch. From amongst the 
hard splittings, we take the three with the largest mass, $m(j_1 + j_2)$. Out of the six branches 
emanating from these three special splittings, there will always be exactly four branches 
that do not contain any of the other splittings. We take these branches as our four subjets, 
discarding the rest of the particles that originally constituted the fat-jet.$^3$ 
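
The declustering procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the results in this paper: the `Protojet` class, the `merge` helper, and the function names are hypothetical, and the clustering tree is taken as given (in practice it would come from a Cambridge/Aachen clustering of the fat-jet). The thresholds default to the values quoted above.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(eq=False)
class Protojet:
    """Node of a clustering tree: a four-momentum plus (optionally) two children."""
    px: float
    py: float
    pz: float
    e: float
    children: Optional[Tuple["Protojet", "Protojet"]] = None

    @property
    def pt(self) -> float:
        return math.hypot(self.px, self.py)

    @property
    def mass(self) -> float:
        m2 = self.e ** 2 - self.px ** 2 - self.py ** 2 - self.pz ** 2
        return math.sqrt(max(m2, 0.0))


def merge(a: Protojet, b: Protojet) -> Protojet:
    """Hypothetical helper: combine two protojets into their parent node."""
    return Protojet(a.px + b.px, a.py + b.py, a.pz + b.pz, a.e + b.e, (a, b))


def hard_splittings(fat, delta_p=0.03, delta_m=0.3):
    """Walk the clustering history backwards and collect the 'hard' splittings."""
    hard, stack = [], [fat]
    while stack:
        node = stack.pop()
        if node.children is None:
            continue
        j1, j2 = sorted(node.children, key=lambda j: j.pt, reverse=True)
        # Soft if j2 carries too little pT, or if j2 is too diffuse (large m/pT).
        is_soft = (j2.pt / fat.pt < delta_p) or (j2.mass / j2.pt > delta_m)
        if is_soft:
            stack.append(j1)  # everything downstream of j2 is also labeled soft
        else:
            hard.append(node)
            stack.extend((j1, j2))
    return hard


def contains(branch, targets):
    """True if the branch is, or contains, any of the target splittings."""
    stack = [branch]
    while stack:
        node = stack.pop()
        if any(node is t for t in targets):
            return True
        if node.children is not None:
            stack.extend(node.children)
    return False


def find_subjets(fat, n_subjets=4, delta_p=0.03, delta_m=0.3):
    """Return n_subjets subjets, or None if there are too few hard splittings."""
    hard = hard_splittings(fat, delta_p, delta_m)
    if len(hard) < n_subjets - 1:
        return None
    chosen = sorted(hard, key=lambda s: s.mass, reverse=True)[: n_subjets - 1]
    subjets = []
    for s in chosen:
        others = [t for t in chosen if t is not s]
        subjets.extend(b for b in s.children if not contains(b, others))
    return subjets
```

Setting `n_subjets` to two or three gives the generalization to other subjet multiplicities mentioned in the footnote.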
2 The $m/p_T$ cut serves to eliminate would-be subjets that are really just diffuse clouds of soft radiation. 
It is somewhat similar to the mass-drop criterion of [16], but we find that it is more effective and less 
redundant with the $p_T$ asymmetry cut, at least in the context of Higgs bosons with TeV-scale momenta. 
3 It is worth pointing out that this procedure can be generalized to arbitrary numbers of desired subjets, 
including two or three, by simply changing the number of massive, hard splittings requested. The 
two-body case essentially degenerates to [16], and the three-body case should overlap substantially with [29]. 
FIG. 2: Normalized distributions of reconstructed hadronic diboson mass, applying four-body 
substructure to 125 GeV (left) and 200 GeV (right) Higgs-jet candidates. Signal (black) is from 2 TeV 
Z' decay, and backgrounds of quark-jets (blue) and gluon-jets (red) are from PYTHIA Z+jets 
samples of $p_T \simeq 1$ TeV. Dashed lines indicate application of the multibody kinematic cuts discussed in 
the text, and are suppressed by the associated efficiencies. 



With these four subjets, we can easily reconstruct the complete diboson system mass. 
This is illustrated in Fig. 2, for fully-showered $h \to WW^{(*)}$ Monte Carlo events passed 
through our simple detector model. The remaining issue is to determine whether their 
multibody kinematics look consistent with the partons from a $WW^{(*)}$ decay. The kinematics 
will clearly be different depending on whether we are below or above the WW threshold. 
For the former, looking at the pair of subjets with the highest combined mass successfully 
reveals the on-shell W peak, whereas the W* does not yield a well-localized feature. For 
the latter, the doubly-on-shell decay can be revealed by considering all partitionings of the 
subjets into two pairs, and taking the partitioning that gives the smallest difference between 
the masses of the pairs. The average of the two pair masses then typically comes out close 
to the W mass. Both reconstructions are illustrated in Fig. 3. 
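
The two W reconstructions can be written compactly as follows. This is a minimal sketch rather than our analysis code: subjets are assumed to be plain `(E, px, py, pz)` tuples in GeV, and the function names are illustrative.

```python
import itertools
import math


def inv_mass(*vecs):
    """Invariant mass of the sum of four-vectors given as (E, px, py, pz)."""
    e, px, py, pz = (sum(v[i] for v in vecs) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def w_mass_below_threshold(subjets):
    """h -> WW*: mass of the subjet pair with the highest combined mass."""
    return max(inv_mass(a, b) for a, b in itertools.combinations(subjets, 2))


def w_mass_above_threshold(subjets):
    """h -> WW: average pair mass for the partition whose two pair masses differ least."""
    a, b, c, d = subjets
    partitions = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    masses = [(inv_mass(*p), inv_mass(*q)) for p, q in partitions]
    m1, m2 = min(masses, key=lambda m: abs(m[0] - m[1]))
    return 0.5 * (m1 + m2)
```

Four subjets admit six pairs but only three partitions into two disjoint pairs, so the above-threshold search is an explicit three-way comparison.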

QCD jets can sometimes fake these features through parton showering. In fact, jets that 



Higher numbers of subjets might be interesting to consider, for example, as a kind of alternative (and 
self-groomed) jet-finding algorithm that does not have a built-in minimum distance. This is analogous 
to ml. [37, 38]. 
FIG. 3: Normalized distributions of reconstructed hadronic W mass, applying four-body substructure 
to 125 GeV (left) and 200 GeV (right) Higgs-jet candidates. Signal (black) is from 2 TeV Z' 
decay, and backgrounds of quark-jets (blue) and gluon-jets (red) are from PYTHIA Z+jets samples 
of $p_T \simeq 1$ TeV. Events are restricted to a reconstructed four-subjet mass in [90, 145] GeV or [160, 220] GeV, respectively. 
Dashed lines indicate application of the subjet-pair mass-ratio kinematic cuts discussed in the text, and are 
suppressed by the associated efficiencies. 



survive the above declustering procedure with mass near $m_h$ almost automatically contain 
subjet pairs with mass near $m_W$. By studying fully showered jets in PYTHIA and HERWIG, 
we have found that a simple additional discriminating variable is the dimensionless ratio 
$m_{\rm pair}^{\min}/m_{\rm pair}^{\max}$, taken between the subjet pairs with the minimum and maximum invariant 
mass. (The pairs need not be exclusive with respect to each other.) The distributions can 
be seen in Fig. 4. For our 125 GeV working point, we found that a minimum cut of 0.2 is 
quite effective. Such a cut appears to subsume any cuts on the W mass, while being strictly 
more powerful. The pairwise mass ratio is also quite effective for our 200 GeV working 
point, and is complementary to a cut demanding that the average pair mass is near $m_W$.$^4$ 
The full tagger is then built on the following kinematic cuts. For the 125 GeV Higgs, we 



4 Of course, the full four-body phase space of the decay at either mass point is quite rich, and in principle 
a more aggressive multivariate approach could be even more powerful. It might also be possible to fold 
in detailed information characterizing the distributions of particles in and around the subjets (see, e.g., 
FIG. 4: Normalized distributions of the ratio between minimum subjet-pair mass ($m_{\rm pair}^{\min}$) and 
maximum subjet-pair mass ($m_{\rm pair}^{\max}$), applying four-body substructure to 125 GeV (left) and 200 GeV 
(right) Higgs-jet candidates. Signal (black) is from 2 TeV Z' decay, and backgrounds of quark-jets 
(blue) and gluon-jets (red) are from PYTHIA Z+jets samples of $p_T \simeq 1$ TeV. Events are restricted 
to a reconstructed four-subjet mass in [90, 145] GeV or [160, 220] GeV, respectively. Dashed lines indicate application of the 
kinematic cuts discussed in the text for the 200 GeV analysis, and are suppressed by the associated 
efficiencies. 





                 p_T ≃ 500 GeV     p_T ≃ 1000 GeV    p_T ≃ 1500 GeV
h (125 GeV)      0.44              0.49              0.51
quark → h        0.015 (0.018)     0.018 (0.024)     0.027 (0.036)
gluon → h        0.039 (0.055)     0.040 (0.056)     0.043 (0.054)

TABLE I: Tag rates for 125 GeV boosted Higgs bosons decaying to hadronic WW*, and mistag 
rates for PYTHIA (HERWIG) QCD jets. 



demand that the four-subjet mass lies in the window [90, 145] GeV, and that 
$m_{\rm pair}^{\min}/m_{\rm pair}^{\max} > 0.2$. For the 200 GeV Higgs, we require a window of [160, 220] GeV, 
$m_{\rm pair}^{\min}/m_{\rm pair}^{\max} > 0.2$, and $m_W > 50$ GeV, according to the above prescription. 
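
For concreteness, the fully hadronic tag decision can be sketched as below, given four subjets that have already been extracted. This is an illustrative sketch only: the `(E, px, py, pz)` tuple representation and the function names are assumptions, while the numerical windows are the ones quoted in the text.

```python
import itertools
import math


def inv_mass(*vecs):
    """Invariant mass of the sum of four-vectors given as (E, px, py, pz)."""
    e, px, py, pz = (sum(v[i] for v in vecs) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def pair_mass_ratio(subjets):
    """Ratio of the minimum to the maximum subjet-pair mass (pairs may overlap)."""
    masses = [inv_mass(a, b) for a, b in itertools.combinations(subjets, 2)]
    return min(masses) / max(masses) if max(masses) > 0.0 else 0.0


def average_w_mass(subjets):
    """Average pair mass for the two-pair partition with the smallest mass difference."""
    a, b, c, d = subjets
    partitions = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    masses = [(inv_mass(*p), inv_mass(*q)) for p, q in partitions]
    m1, m2 = min(masses, key=lambda m: abs(m[0] - m[1]))
    return 0.5 * (m1 + m2)


def passes_hadronic_tag(subjets, m_h):
    """Apply the kinematic cuts of the 125 GeV or 200 GeV working point."""
    m4 = inv_mass(*subjets)
    if pair_mass_ratio(subjets) <= 0.2:
        return False
    if m_h == 125:
        return 90.0 <= m4 <= 145.0
    if m_h == 200:
        return 160.0 <= m4 <= 220.0 and average_w_mass(subjets) > 50.0
    raise ValueError("only the 125 GeV and 200 GeV working points are defined")
```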



We indicate the tag rates for boosted Higgses and mistag rates for quark- and gluon-jets in 
Tables I and II. (The fractional statistical errors on these numbers are percent-scale.) Signal 
samples are from MadGraph v4.5.1 [44] interfaced with PYTHIA, and background samples 
                 p_T ≃ 500 GeV             p_T ≃ 1000 GeV            p_T ≃ 1500 GeV
h (200 GeV)      u.oy                      0.50                      0.51
quark → h        4.9×10^-3 (7.2×10^-3)     5.6×10^-3 (8.4×10^-3)     6.6×10^-3 (0.010)
gluon → h        0.013 (0.020)             0.016 (0.026)             0.016 (0.027)

TABLE II: Tag rates for 200 GeV boosted Higgs bosons decaying to hadronic WW, and mistag 
rates for PYTHIA (HERWIG) QCD jets. 



are from PYTHIA v6.4.15 (virtuality-ordered) and HERWIG 6.520 [46] (interfaced with 
JIMMY Q). 

The tagger has a typical signal acceptance of about 50%, and mistag rates that range 
from 0.5% to 5%. Generally speaking, gluon-jets have 2-3 times higher mistag rates than 
quark-jets, and HERWIG (angle-ordered) jets can have roughly 50% higher mistag rates than 
PYTHIA (virtuality-ordered) jets. Mistag rates for the 125 GeV $h \to WW^*$ are about 2-3 
times bigger than for the 200 GeV $h \to WW$. Removing the detector model and reverting to 
particle level decreases mistag rates by about a factor of 2, mainly due to better resolution. 
Tag and mistag rates are both fairly stable versus $p_T$. 



B. Semileptonic 

When a boosted $WW^{(*)}$ pair decays semileptonically, we in principle get a much cleaner 
tag, assuming that we can reliably identify the lepton in the presence of nearby hadronic 
activity. For 1 (3) TeV Z' decay into a Z and a 125 GeV Higgs, the lepton is more than 
$\Delta R = 0.4$ away from the accompanying jet activity in only 40% (5%) of events, so normal 
isolation-based lepton identification will often fail. The issue of lepton identification inside 
of a jet has been dealt with before in the context of semileptonic top-jets [24, 48-51]. There 
it has been pointed out that the high mass scale of the decay can be exploited to efficiently 
separate out signal leptons from heavy flavor decays or fakes, even for very small $\Delta R$ scales. 
The separation appears to be good enough that, to first approximation, it is valid to ignore 
these sources of background. In particular, [49, 51] showed that for dijet-like configurations 
with a lepton near one of the jet axes, electroweak bremsstrahlung of a leptonic W was a 
much bigger worry than heavy flavor decays. 

The main technique which we utilize for lepton identification is mini-isolation, introduced 
in [49]. Real or fake leptons produced inside of very energetic jets are almost always 
accompanied by nearby showering and decay products. For the specific case of leptons from heavy 
flavor decay, the relevant $\Delta R$ scale is set by the parent quark's mass-to-$p_T$ ratio. While 
this can become quite small (much smaller than a standard lepton isolation cone), an 
isolation variable built using a miniature cone tailored to this $\Delta R$ and tallying only tracks 
should be quite robust for vetoing heavy flavor up to almost arbitrarily high energy.$^5$ In 
contrast, leptons produced from boosted top or Higgs decays have a much larger intrinsic 
$\Delta R$ scale for the same $p_T$, and are therefore fairly clear of extra radiation at the scale of the 
miniature cone associated with heavy flavor. 

More explicitly, mini-isolation utilizes a cone that scales inversely with the lepton $p_T$: 
$R_{\rm iso} = (15~{\rm GeV})/p_T(l)$. The $p_T$ of all tracks inside of this cone is scalar-summed, and the sum 
is not allowed to exceed 10% of the lepton $p_T$. While the original version of mini-isolation was 
applied only to muons, subsequent experimental simulations have applied the same technique 
to electrons and found similarly good performance [51]. Still, electron identification requires 
much more rigorous quality criteria, for example to discriminate against charged pions that 
deposit a large fraction of their energy in the ECAL. We cannot properly reproduce these 
quality criteria in our own highly simplified analysis, but we do restrict the size of the 
mini-isolation cone to be larger than 0.1 for electrons. This in principle gives enough space to 
properly identify the electron track and its associated tracker/ECAL shower.$^6$ 
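
The mini-isolation criterion is simple enough to state in code. The following is a minimal sketch, assuming leptons and tracks are given as `(pT, eta, phi)` tuples with $p_T$ in GeV and that the lepton's own track has already been removed from the track list; the function names and argument layout are illustrative, not taken from any experiment's software.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the eta-phi plane, with the phi difference wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def mini_isolated(lepton, tracks, is_electron=False,
                  scale=15.0, max_fraction=0.10, min_electron_cone=0.1):
    """Return True if the lepton passes mini-isolation.

    lepton: (pt, eta, phi); tracks: iterable of (pt, eta, phi), lepton track excluded.
    """
    pt, eta, phi = lepton
    r_iso = scale / pt  # cone shrinks as the lepton pT grows
    if is_electron:
        r_iso = max(r_iso, min_electron_cone)  # electrons: cone no smaller than 0.1
    sum_pt = sum(t_pt for t_pt, t_eta, t_phi in tracks
                 if delta_r(eta, phi, t_eta, t_phi) < r_iso)
    return sum_pt <= max_fraction * pt
```

For a 300 GeV muon the cone radius is 0.05, so a track at $\Delta R = 0.08$ does not count against it, whereas the same track would fall inside the 0.1 cone used for an electron.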

Besides the embedded lepton, semileptonic diboson-jets also have an additional layer of 
substructure that we can exploit, in the form of the accompanying subjet-doublet which we 
reveal through the "mass drop" algorithm introduced by Butterworth, Davison, Rubin, and 
Salam (BDRS) [16]. For partially off-shell decays, the subjets may have a mass near $m_W$ or 
may occupy the continuum not far below $m_W$. For double-on-shell decays, the subjets will 
always have a mass near $m_W$. In either case, both the cluster transverse mass of the $l\nu q\bar{q}'$ 
system and the total visible mass of the $l q\bar{q}'$ system will be close to $m_h$. This potentially 
gives us a lot of kinematic discrimination power against backgrounds. 

More specifically, in cases where the diboson-jet recoils against visible activity (e.g., 



5 Because this form of isolation is based solely on tracks, it may also be less sensitive to pileup than 
calorimeter-based isolation. 

6 We do not attempt to impose "isolation" within the ECAL, which is also necessary for clean electron ID. 
However, tracker isolation is in any case highly correlated with electromagnetic isolation. 
FIG. 5: Normalized distributions of reconstructed diboson-jet mass, applying semileptonic 
substructure to 125 GeV (left) and 200 GeV (right) Higgs-jet candidates. Signal (black) 
is from 2 TeV Z' decay, and backgrounds of W-strahlung (blue), semileptonic top-jets (red), and 
QCD jets with embedded leptons (green) are from MadGraph5 or standalone PYTHIA dijet samples 
of $p_T \simeq 1$ TeV. Dashed lines indicate application of the mini-isolation and substructure kinematic 
cuts discussed in the text, and are suppressed by the associated efficiencies. 



$Z \to l^+l^-$ or $q\bar{q}$), the missing energy vector is a faithful tracer of the neutrino's $p_T$, allowing a 
complete reconstruction in principle. Here, we take a somewhat more minimalistic approach 
by simply setting the rapidity of the neutrino to match that of the four-vector sum of 
the lepton and the two subjets. Given the possible uncertainties in determining the exact 
pointing of the $\slashed{E}_T$ vector, and the high sensitivity of our reconstruction to even O(0.1) errors 
in angle, we also conservatively align the neutrino with the lepton+subjets in $\phi$. We can 
then take the invariant mass of this system as a proxy for the diboson candidate mass. The 
result of this procedure is illustrated in Fig. 5, which displays a clear peak near $m_h$. In cases 
where the diboson-jet recoils against a system with additional neutrinos (such as $Z \to \nu\bar{\nu}$), 
the $\slashed{E}_T$ no longer gives us a clear indication of the $p_T$ of the neutrino in the W decay, 
and we must resort instead to the visible mass. While this is also a good mass estimator, 
the distribution is broader, and there is 2-3 times more background under the signal peak 
from background events which would have otherwise had too-high reconstructed diboson mass. 
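
The collinear neutrino ansatz can be illustrated as follows. This is a sketch under explicit assumptions: four-vectors are `(E, px, py, pz)` tuples in GeV, the function names are hypothetical, and the neutrino $p_T$ magnitude is taken to be the magnitude of the missing transverse energy, which is one simple reading of the prescription above.

```python
import math


def from_pt_y_phi(pt, y, phi, m=0.0):
    """Build (E, px, py, pz) from transverse momentum, rapidity, azimuth, and mass."""
    mt = math.hypot(pt, m)
    return (mt * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), mt * math.sinh(y))


def add(*vecs):
    return tuple(sum(v[i] for v in vecs) for i in range(4))


def mass(v):
    e, px, py, pz = v
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def visible_mass(lepton, subjets):
    """Mass of the lepton + subjets system, used when the neutrino cannot be inferred."""
    return mass(add(lepton, *subjets))


def diboson_mass(lepton, subjets, met):
    """Mass with a massless neutrino aligned with the visible system in rapidity and phi."""
    vis = add(lepton, *subjets)
    e, px, py, pz = vis
    y_vis = 0.5 * math.log((e + pz) / (e - pz))
    phi_vis = math.atan2(py, px)
    neutrino = from_pt_y_phi(met, y_vis, phi_vis)  # assumption: |pT(nu)| = MET magnitude
    return mass(add(vis, neutrino))
```

With this alignment the result reduces to $m^2 = m_{\rm vis}^2 + 2\,p_T(\nu)\,[m_T^{\rm vis} - p_T^{\rm vis}]$, so the neutrino can only raise the reconstructed mass relative to the visible mass.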

By combining lepton mini-isolation, two-body hadronic substructure, and lepton/subjet 








                      p_T ≃ 500 GeV       p_T ≃ 1000 GeV      p_T ≃ 1500 GeV
h (125 GeV)           0.48 (U.o4J         0.45 (O./oJ         0.43 (0. (0)
j → h (virtual W)     l.Ux 1U (U.Ulzj     2.6x10 (7.3x10 J    7.1×10^-5 (4.8×10^-3)
j → h (real W)        0.044 (0.96)        0.034 (0.95)        0.024 (0.94)
t_semilep → h         0.084 (0.89)        0.07 (0.85)         0.069 (0.83)

TABLE III: Tag rates for 125 GeV boosted Higgs bosons decaying to semileptonic WW* and mistag 
rates for QCD jets with an embedded lepton. (The numbers in parentheses are tag rates for 
mini-isolation alone.) 

kinematics, we can form our semileptonic diboson tagger. Leptons are matched to the 
closest fat-jet in AR. (As in the fully hadronic case, fat-jets are reconstructed with 
R = 1.5.) For the case of a 125 GeV Higgs, we require the subjet-pair mass to lie in 
the window [20, 100] GeV and that the reconstructed diboson mass should be 
less than 130 GeV. (The same diboson mass criterion is applied regardless of whether we 
can use the neutrino or not.) To further reduce backgrounds from QCD, we also require 
$\Delta R(l, j_W) > (30~{\rm GeV})/p_T(l j_W j_W)$, where $j_W$ refers to either hadronic $W^{(*)}$-subjet. For the 
200 GeV Higgs, the analogous cuts are a subjet-pair mass window of [60, 100] GeV and a 
reconstructed diboson mass below 200 GeV, as well as $\Delta R(l, j_W) > (50~{\rm GeV})/p_T(l j_W j_W)$. 
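
The semileptonic selection amounts to a small table of working-point-dependent cuts. The sketch below assumes the kinematic inputs have already been computed elsewhere; the class, dictionary, and function names are illustrative rather than part of any existing package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemileptonicCuts:
    m_pair_min: float     # lower edge of the subjet-pair mass window [GeV]
    m_pair_max: float     # upper edge of the subjet-pair mass window [GeV]
    m_diboson_max: float  # upper bound on the reconstructed diboson mass [GeV]
    dr_scale: float       # scale in Delta R(l, j_W) > dr_scale / pT(l j_W j_W) [GeV]


# Working points as quoted in the text.
CUTS = {
    125: SemileptonicCuts(20.0, 100.0, 130.0, 30.0),
    200: SemileptonicCuts(60.0, 100.0, 200.0, 50.0),
}


def passes_semileptonic_tag(m_h, lepton_mini_isolated, m_subjet_pair, m_diboson,
                            dr_lepton_subjets, pt_system):
    """Combine mini-isolation, the subjet-pair window, the diboson mass, and Delta R cuts.

    dr_lepton_subjets: Delta R between the lepton and each of the two W-subjets.
    pt_system: pT of the lepton + two-subjet system in GeV.
    """
    cuts = CUTS[m_h]
    return (lepton_mini_isolated
            and cuts.m_pair_min <= m_subjet_pair <= cuts.m_pair_max
            and m_diboson < cuts.m_diboson_max
            and min(dr_lepton_subjets) > cuts.dr_scale / pt_system)
```

Taking the minimum over the two lepton-subjet separations enforces the $\Delta R$ requirement for either subjet.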

In Tables III and IV, we show the tag rates for semileptonic Higgs-jets and QCD jets with 
an embedded hard lepton. We consider three mechanisms for lepton production inside of jets: 
virtual W emission in hadron decays (dominated by radiatively-produced b and c quarks), 
real W bremsstrahlung from left-handed quarks, and semileptonic decays of boosted top 
quarks. We model the first using PYTHIA dijets, including radiative and prompt heavy flavor 
contributions, as well as decays-in-flight of light mesons. (These samples contain a central 
lepton with $p_T > 25$ GeV in approximately 5% of events with TeV-scale jets.) Real 
W-strahlung samples are generated from ($W \to l\nu$) simulations in MadGraph5 v1.3.30 [52]. 
Semileptonic top samples come from 6-body $t\bar{t} \to (l\nu b)(q\bar{q}'b)$ simulations, also in MadGraph5. 

We choose an operating point with Higgs-jet efficiency very similar to the fully hadronic 
tag, namely about 50% across the entire $p_T$ range. The mistag rates for jets with embedded 
leptons via virtual Ws in hadron decays are extremely small, usually $O(10^{-3})$ or smaller. 
Mini-isolation alone would yield percent or smaller mistag rates (consistent with [49]), but 
TABLE IV: Tag rates for 200 GeV boosted Higgs bosons decaying to semileptonic WW and mistag 
rates for QCD jets with an embedded lepton. (The numbers in parentheses are tag rates for 
mini-isolation alone.) 

TABLE V: Branching fractions for $Z(WW^{(*)})$ into various final states (excepting taus). 

the addition of multibody kinematic cuts is also significant. In our subsequent Z' searches in 
the next section, we find that backgrounds of this type are always subleading. The remaining 
important sources of fake semileptonic diboson-jets, namely real W emissions directly off of 
light quarks or in top quark decay, exhibit 1-10% mistag rates. 

III. Z' -> Zh SEARCH 

With our substructure tools in place, we now demonstrate their utility in the context 
of the search for resonant Zh production. We focus on the case of a narrow TeV-scale Z' 
produced in $q\bar{q}$ annihilation, and assume $m_h$ = 125 GeV or 200 GeV. We derive estimates
of discovery sensitivity at the 14 TeV LHC, which can have appreciable Z' production cross 
sections at truly multi-TeV masses, generating boosted Higgses with momenta in excess of 
a TeV. We reserve an investigation of sensitivity/exclusion at the 7 or 8 TeV LHC for future 
work. 7

Since we are ultimately dealing with a triboson signal, we face a large number of possible 
final states in which we can run a search. To initially reduce the number of options, we 
neglect channels with tau-leptons. But this still leaves us with nine possibilities, spanning a 
large range of branching fractions. The situation is illustrated in Table V. As usual, channels
with small branching fractions are typically cleaner because they have more leptonic activity. 
However, even a channel with little background loses its utility if the total number of signal 
events is unobservably small. Also, given the jet substructure tools outlined in the previous 
section, the benefits of trading off a quark for a lepton are not always as dramatic as we 
might first guess. 

Our approach here will be to simply run a survey over most of the channels. The one 
notable exception is the fully hadronic channel $Zh \to (q\bar{q})(qq'qq')$, which faces a substantial
background from dijet QCD and top pairs, even after exploiting the potential quintuply- 
resonant structure of the signal. (Although for very high-mass searches, where the back- 
grounds have fallen off, this may nonetheless become competitive.) This leaves us with eight 
exclusive channels, with varying degrees of leptonic, hadronic, and invisible activity. 

For each of these eight channels, we simulate the signal and the most important back- 
grounds using leading-order matrix-elements in MadGraph or PYTHIA, supplemented with the 
leading-log virtuality-ordered parton shower in PYTHIA. We subsequently pass the events 
through the simple calorimeter model described in Appendix B. We then cluster the event
into quasi-hemispheric fat-jets of R = 1.5, and decluster these as either Z-jets or diboson-jets.
The physics simulations and reconstructed object definitions are described in more
detail in Appendix A.

The different channels face different issues related to energy/mass resolution and kinematic
ambiguities, owing to different numbers of leptons versus jets and different numbers
of neutrinos. In cases where the Z decays invisibly and/or the $WW^{(*)}$ system decays
semi-invisibly, we default to a minimalistic reconstruction of a single effective neutrino: the
missing energy vector serves as the transverse momentum vector of the neutrino, and the
neutrino's rapidity is set equal to that of the subsystem composed of the visible $WW^{(*)}$
decay products (subjets and/or leptons). We then (semi-)reconstruct the event mass by
adding all visible products and this effective neutrino. In channels with a visible Z decay,
knowing the exact kinematics of the neutrinos is not so crucial for reconstructing the global
mass of the event, and it is largely adequate to know the summed invisible $p_T$ and approximate
summed $p_z$. In cases with an invisible Z decay, we face an inevitable degradation of
Z' mass reconstruction, which gets worse as the $WW^{(*)}$ system becomes more invisible. 8
We illustrate the quality of our Z' mass reconstruction in all eight of our search channels in
Fig. 6.

7 In particular, this may serve as an interesting context in which to find the Higgs in the first place (or
other exotic scalars), especially if it is nonstandard. However, this possibility must be carefully studied
incorporating constraints from existing Higgs and new physics searches.

FIG. 6: Reconstructed Z' lineshapes for a 125 GeV Higgs and $m_{Z'}$ = 2 TeV in all eight of our
analysis channels. Colors indicate Higgs decay mode: $l\nu l\nu$ (blue), $l\nu qq'$ (red), and $qq'qq'$ (black).
Dashing indicates Z decay mode: $l^+l^-$ (dashed), $\nu\bar{\nu}$ (dot-dashed), $q\bar{q}$ (solid). The plot on the right
is a zoom-in of the plot on the left.

We structure all of our searches using a simple "cut and count" approach. Details of the 
cuts vary from channel-to-channel, but a handful of common requirements can be identified: 

• A tagged diboson-jet recoiling against either a tagged hadronic Z-jet, a leptonic Z, or
substantial $\slashed{E}_T$.

• A sizeable ratio of the $p_T$ of each side of the event to the reconstructed Z' mass.

• A reconstructed Z' mass localized in a window centered on the signal.

8 However, it is worth pointing out that a Z' produced in $q\bar{q}$ annihilation is polarized along the beam, and
is forbidden to decay back down to a scalar plus (longitudinal) gauge boson along the beamline. This
leads to a bias for central production, and more of the event's energy going transverse. Consequently, the
degrading effect on the peak can be less of an issue than it is for, say, the transverse mass distribution of
singly-produced leptonic W bosons.

The second requirement only applies to cases where the Z is visible. It removes regions of 
phase space with high-$\sqrt{\hat{s}}$ but low-$p_T$, characteristic of many QCD-induced processes with
t-channel singularities. Usually, after the first two requirements, the signal appears as a 
localized excess on top of a smoothly-falling background distribution of reconstructed event 
mass. 

After defining our search regions, we characterize the expected significance of the signal
by taking the common $S/\sqrt{B}$ approximation, and determine the signal cross section required
to achieve $5\sigma$. Since this becomes a very poor approximation for $O(1)$ number of events,
and since even nearly-vanishing physics backgrounds may in reality be supplemented by
instrumental backgrounds, we introduce a statistical "regulator" by enforcing a floor of
B = 4 events in each channel. This means, for example, that even for channels with
essentially zero expected background events according to our simple estimates, we require
at least 10 signal events to claim $5\sigma$ significance.
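As a minimal sketch of this regulated counting estimate, the following Python converts an expected background into the signal yield and cross section needed for $5\sigma$. The function names and the conversion to a cross section via luminosity, efficiency, and branching fraction are illustrative assumptions, not the authors' code.

```python
import math


def required_signal(background_events, n_sigma=5.0, floor=4.0):
    """Signal events needed for n_sigma in the S/sqrt(B) approximation,
    with the background count never allowed to drop below `floor`."""
    return n_sigma * math.sqrt(max(background_events, floor))


def required_xsec_times_br_fb(background_xsec_fb, signal_eff, channel_br,
                              lumi_fb=100.0):
    """sigma(pp -> Z') x BR(Z' -> Zh) in fb needed for discovery in one channel.

    background_xsec_fb: background cross section after all cuts (fb)
    signal_eff:         signal efficiency after cuts (excluding branching fraction)
    channel_br:         total branching fraction of Zh into this channel
    """
    n_bkg = background_xsec_fb * lumi_fb
    return required_signal(n_bkg) / (lumi_fb * signal_eff * channel_br)
```

With vanishing background the floor gives `required_signal(0.0) == 10.0`, reproducing the 10-event requirement stated above.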

While sometimes one channel may clearly dominate the discovery reach, there is often 
much to be gained by forming a statistical combination. This is especially true for the 
present situation, where the full signal appears distributed across a large number of exclusive 
analyses. We perform our combination by forming a summed event count, with each channel 
weighted by S/B. The final result is equivalent to adding together the significance of all 
channels in quadrature. 

We present our final search reach estimates in Figs. 7 and 8 for the 125 GeV and 200 GeV
Higgs, respectively. (The $WW^{(*)}$ branching fractions are 22% and 74%.) The plots show
the minimum $\sigma(pp \to Z') \times {\rm BR}(Z' \to Zh)$ required for a $5\sigma$ discovery after a 100 fb$^{-1}$ run
of the 14 TeV LHC. They are supplemented by plots of the $\sigma \times$ BR required for S/B = 1,
to give some sense of the relative size of the backgrounds and the degree to which we might
worry about systematic errors. 9 (Here we do not explicitly attempt to estimate systematic
errors associated with background shape/normalization.)

9 The smallest S/B's which we encounter at a claimed discovery limit are $O(1/10)$, corresponding to an
excess of a few hundred signal events on top of a few thousand background events.

FIG. 7: Discovery reach with 100 fb$^{-1}$ (left) and $\sigma \times$ BR at which S/B = 1 (right), for a 125 GeV
Higgs. Colors indicate Higgs decay mode: $l\nu l\nu$ (blue), $l\nu qq'$ (red), and $qq'qq'$ (black). Dashing
indicates Z decay mode: $l^+l^-$ (dashed), $\nu\bar{\nu}$ (dot-dashed), $q\bar{q}$ (solid). Dark yellow is a simple
statistical combination.

The most powerful individual search channel depends on the Z' mass and LHC luminosity,
as well as on the precise cuts which have been applied. However, we can infer some general
features regarding how the channels compare with one another given our own set of
choices. Channels with large numbers of leptons, such as the four-lepton $Zh \to (l^+l^-)(l\nu l\nu)$,
have very little background but are so rate-limited that they cannot compete at high masses.
Indeed, it is the mostly-hadronic modes $(q\bar{q})(l\nu qq')$ and $(\nu\bar{\nu})(qq'qq')$ that dominate the search
reach above 3 TeV for 100 fb$^{-1}$. This is greatly facilitated by the jet substructure methods
which we have applied (hadronic Z-tag, semileptonic and hadronic diboson-tags), which
heavily suppress the otherwise sizeable backgrounds. 10 At masses below 3 TeV, the situation
is more mixed, but we find that the leaders are $(q\bar{q})(l\nu qq')$, $(q\bar{q})(l\nu l\nu)$, and $(\nu\bar{\nu})(l\nu qq')$. Below
about 1.2 TeV, $(l^+l^-)(l\nu qq')$ becomes dominant. Each of these channels benefits in some way
from two-body hadronic substructure, and most benefit from the full semileptonic diboson-jet
tagger.

10 The channel $Zh \to (q\bar{q})(l\nu qq')$ was considered previously in work which found similar model reach at high
mass without application of jet substructure techniques. We note that those results capitalized on very
strict kinematic cuts without application of showering or energy resolution effects, and did not include top
backgrounds. Still, the question of how to best trade off between substructure cuts and more traditional
kinematic cuts remains a nontrivial one.

FIG. 8: Discovery reach with 100 fb$^{-1}$ (left) and $\sigma \times$ BR at which S/B = 1 (right), for a 200 GeV
Higgs. Colors indicate Higgs decay mode: $l\nu l\nu$ (blue), $l\nu qq'$ (red), and $qq'qq'$ (black). Dashing
indicates Z decay mode: $l^+l^-$ (dashed), $\nu\bar{\nu}$ (dot-dashed), $q\bar{q}$ (solid). Dark yellow is a simple
statistical combination.

Of course, given the typically $O(1)$ differences in discovery reach that we have found, a
more complete experimental analysis would be required to make any solid claims of relative 
performance. Still, it is interesting that jet substructure makes so many of these channels 
competitive, whereas a more conservative approach might discard many of the hadronic 
options in favor of modes with more leptons. 

In Fig. 9 we show how the discovery reach evolves with luminosity, picking either the
best channel for a given mass point or performing the combination of channels. We also
show four baseline models with sizeable $\sigma \times$ BR into Zh: a sequential Z', the warped KK
Z' bosons of [8], the $\eta$ boson of the $E_6$ unified model, and the $T_{3R}$ boson of a left-right
symmetric model with $SU(2)_R \to U(1)$ broken at a high scale (reviewed in [l^). 11 (The
last two have rates that are within 10% of one another, and they are represented by a
single line.) Many of these models have roughly democratic couplings to Standard Model
fermions and to the Higgs doublet, whereas the KK bosons have small couplings to the
former interplaying with large couplings to the latter. For the 200 GeV Higgs, which is
excluded if it has Standard Model-like couplings [2], we can instead take as a baseline a
fermiophobic Higgs (as in, e.g., [55]) with the same U(1)' charge as its SM-like counterpart.
While there are many ways to hide the Higgs boson, this assumption keeps ${\rm BR}(Z' \to Zh)$
and ${\rm BR}(h \to WW)$ largely unchanged, and allows the simplest comparison to the 125 GeV
analysis. Still, any realistic fermiophobic model would also require an additional source
of electroweak symmetry breaking to give mass to the top quark and other fermions, and
consequently there will be some degree of suppression of $Z' \to Zh$ since the fermiophobic
Higgs cannot saturate EWSB.

11 We define the Z'-Higgs couplings in the sequential Z' model by replacing $Z_\mu \to Z_\mu + (m_Z/m_{Z'})^2 Z'_\mu$ in
all Z-Higgs couplings. This yields ${\rm BR}(Z' \to Zh) \simeq 3\%$, which is comparable to the BR into electrons
(or muons). The warped KK Z' is a set of three neutral bosons with variable BRs to Zh, described in
detail in [8]. For the $E_6$ and $T_{3R}$ models, we assume that a single "up-type" Higgs dominates electroweak
symmetry-breaking, in which case the Zh (electron/muon) BRs are respectively 5.3% (3.3%) and 2.3%
(4.5%). Dilepton resonance searches at the LHC constrain the sequential, $E_6$, and $T_{3R}$ bosons to be above
about 2 TeV @,0. Electroweak precision constraints suggest that the KK, $E_6$, and $T_{3R}$ bosons are
heavier than about 3 TeV.

FIG. 9: Discovery reach for different luminosities for a 125 GeV Higgs (left) and a 200 GeV
Higgs (right), superimposed with Z' model cross section curves. The displayed luminosities are
10 fb$^{-1}$ (blue), 100 fb$^{-1}$ (purple), and 1000 fb$^{-1}$ (dark yellow). Solid lines are a simple statistical
combination of all modes, and dashed lines are the best individual mode for a given Z' mass. Models
are indicated by short-dashed lines, and include a sequential Z' boson (green), the warped KK Z'
bosons of [8] (black), and either the $\eta$ boson of $E_6$ or a $T_{3R}$ gauge boson (red).

Within the context of this diverse handful of Z' models, we see that the 14 TeV LHC has 
good discovery prospects for masses well above a TeV. Even for the 125 GeV case, where the 
WW* decay mode is subleading, models with masses in the 1.5-2.5 TeV range are visible 
with $O(100~{\rm fb}^{-1})$ luminosity. For the 200 GeV case, the mass reach potentially goes up to
about 2.5-3.0 TeV, modulo the necessity of additional model-dependence to hide the Higgs
from direct searches. We have also indicated what might be achieved with a 1000 fb$^{-1}$
luminosity upgrade: 3 TeV or higher mass reach for the 125 GeV Higgs, and better than 
3.5 TeV mass reach for the 200 GeV Higgs. 

If the Higgs indeed resides at 125 GeV, and is largely standard in its couplings, then the 
$h \to WW^*$ decay rate is smaller than $b\bar{b}$ by about a factor of 3. The study of [4] suggests
that a search for $Z' \to Zh \to Z(b\bar{b})$ would be more sensitive, typically by an $O(1)$ factor in
$\sigma \times$ BR. (The comparison is nontrivial, as the present diboson study can utilize the $Z \to q\bar{q}$
decays, whereas if the Higgs also decays to jets, even b-jets, the corresponding channels are
probably highly background-contaminated.) Nonetheless, even if these estimates are correct,
a corroborating observation of $Z' \to Zh \to Z(WW^*)$ would be extremely useful. It would
not only serve to verify the existence of the Z', but would improve confidence that the $b\bar{b}$
resonance into which it decays is really the Higgs. Our present results suggest that this is
not only possible, but that both discoveries may occur on a similar timescale. Of course,
there is still a chance that Nature will surprise us, either by yielding up a Higgs which is
actually dominated by $WW^{(*)}$, or by providing additional scalar states with diboson decay
modes. 



IV. CONCLUSIONS 

In this paper we have considered how to discriminate boosted $WW^{(*)}/ZZ^{(*)}$ "diboson-jet"
systems utilizing jet substructure techniques, focusing on the cases of $h \to WW^*$ for
a 125 GeV Higgs and $h \to WW$ for a 200 GeV Higgs. For fully hadronic decays, we
demonstrated the feasibility of a dedicated tagger by identifying the hardest splittings in
the diboson-jet, filtering out diffuse radiation, and exploiting the kinematic distributions
of the resulting four hard subjets. For tag rates of approximately 50%, we achieve mistag
rates between 0.5-5%. For semileptonic decays, we showed that mini-isolation and two-body
hadronic substructure could be combined to powerfully reject QCD jets with embedded
leptons from heavy flavor decays, as well as jets with W-strahlung emission and semileptonic
top-jets. Again working at 50% tag rate, we reject the former at the $10^{-4}$ level, and find
$O(5\%)$ mistags for jets with real Ws. For both decay modes, in order to keep the
tag/mistag rates stable as we exceed $p_T \sim$ TeV, and as the angular scales shrink to the size
of individual hadronic calorimeter elements, we have investigated the possibility of a novel
ECAL/HCAL hybridization.

As a simple application of these diboson-jet taggers, we have outlined searches for a
TeV scale $Z' \to Zh$ resonance at the 14 TeV LHC using the various $Z(WW^{(*)})$ final-state
channels. We found that channels with one or more hadronic W and/or Z decays
are more sensitive than the exclusively leptonic channels. In particular, the leaders are
$Zh \to (q\bar{q})(l\nu qq')$, $(\nu\bar{\nu})(l\nu qq')$, $(\nu\bar{\nu})(qq'qq')$, $(q\bar{q})(l\nu l\nu)$, and, at masses below about a TeV,
$(l^+l^-)(l\nu qq')$. Only one of these channels uses the rare but highly distinctive fully leptonic
$WW^{(*)} \to (l\nu)(l\nu)$ decay, and, as we scan up in Z' masses, it ultimately runs out of statistics
faster than channels that capitalize on fully hadronic or semileptonic diboson-jet tagging.
Taken together, our results suggest that resonances in the truly multi-TeV mass range 
should be discoverable via the diboson decay modes of the Higgs, and that searches for these 
resonances would be substantially facilitated and expanded by jet substructure techniques. 

While this demonstrates the long-term LHC reach using extremely collimated diboson 
systems, we emphasize that our techniques become useful even at momenta only a few times 
higher than the Higgs mass itself, which may be accessible to the current 7 and 8 TeV LHC. 
Understanding Z' sensitivity at these energies, or even the prospects for first discovery of a
nonstandard Higgs in Z' decay, is an interesting open topic. 

Although we have exclusively studied the specific decay sequence $Z' \to Zh \to Z(WW^{(*)})$
for simplicity, we note that many of the same techniques will also apply to $W' \to Wh \to
W(WW^{(*)})$, as well as to the subdominant Higgs decay mode $h \to ZZ^{(*)}$. The latter case in
particular was considered in the six-lepton final state in [56], albeit with very limited reach
in Z' mass due to the very small total branching fraction. Adding in modes with partially
or fully hadronic decays of the boosted $ZZ^{(*)}$ system is a straightforward generalization
of our treatment of $WW^{(*)}$, and indeed the latter would be picked up for free anyway.
(This would increase statistics for our 125 GeV and 200 GeV analyses by about 10% and
30%, respectively, though we did not explicitly include them.) Incorporating the $ZZ^{(*)} \to
l^+l^-q\bar{q}$ modes could offer some interesting supplementary search channels, such as $Zh \to
Z(ZZ^{(*)}) \to (q\bar{q})(l^+l^-q\bar{q})$, which has similar topology to $Zh \to Z(WW^{(*)}) \to (q\bar{q})(l\nu qq')$.
Still, the small total rates make these modes less attractive for a multi-TeV resonance search,
where statistics is a major limiting factor.

Diboson-jet tagging can have a wider application beyond searches involving boosted Higgses.
Indeed, if the Higgs is near 125 GeV, as suggested by the LHC data, then our Z' search
could also be accomplished using the dominant Higgs decay mode $h \to b\bar{b}$ [4] (or the
subdominant but highly distinctive ditau mode j^]). However, many models of new physics also
predict scalars beyond the Higgs, which can dominantly decay into $WW^{(*)}$ through mixing
9[]. A particularly relevant example in the context of renormalizable Z' models would be the
Higgs(es) of the U(1)' sector. Other types of boosted multi-jet cascades from new physics
can also benefit from our treatment, even if the intermediate state is not $WW^{(*)}/ZZ^{(*)}$.

There are also many other ways to produce boosted Higgses, such as in heavy top partner
decays or due to strong EWSB effects in weak boson scattering [57]. Having diboson-jet
tagging available as a tool for identifying these processes could greatly improve sensitivity,
especially in final states with multiple boosted Higgses where we can include both $b\bar{b}$ and
$WW^*$ decay modes.
Appendix A: Analysis Details

We generate $Z' \to Zh \to Z(WW^{(*)})$ signal samples, keeping all spin correlations, using
MadGraph v4.5.1 [44] interfaced with PYTHIA (virtuality-ordered shower). For background
samples, we use MadGraph5 v1.3.30 [52]. All processes are evaluated at leading order,
and without pileup. 12 We do not attempt to use matrix elements to model the multibody
distribution of partons inside a jet, except in cases involving a heavy particle decay (such
as a top quark). So, for example, for a QCD jet faking a hadronic diboson-jet, the jet
is generated by a single hard parton, and its substructure is generated entirely by parton
showering. As noted in Section II, there is sensitivity to the exact showering model when
dealing with hadronic substructure. In such cases we have erred on the conservative side by
rescaling the background rate to match what would be expected from the HERWIG shower.

We pass the showered and hadronized samples through the detector simulation described
in Appendix B and proceed to reconstruct jets, leptons and $\slashed{E}_T$. Finally, we apply
substructure methods and analysis cuts.

For leptons, we smear the energies by 2% for electrons and by $(5\%) \times \sqrt{E/{\rm TeV}}$ for muons.
Subsequently, we only consider leptons with $p_T > 30$ GeV and $|\eta| < 2.5$. We make a first
pass over the event, searching for electrons and muons which are isolated in a traditional
sense: scalar-summing the $p_T$ of all photons and hadrons within an $\eta$-$\phi$ cone of R = 0.4
around the lepton, $p_T(l)/(p_T(l) + p_T({\rm cone})) > 0.9$. (Note that we do not include other
nearby leptons in the cone.) Amongst these, any opposite-sign same-flavor pairs with a
mass $m(l^+l^-)$ = [81, 101] GeV are added together and treated as leptonic Z bosons. If, after
Z boson clustering, there are no other isolated leptons in the event, then we check whether
there are any non-isolated leptons, and keep the hardest one if it is mini-isolated as defined
in Section II B. All remaining leptons are considered "hadrons" for jet clustering, below.
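The traditional isolation criterion and the leptonic Z pairing described above can be sketched as follows. The lepton representation (a dict with `flavor`, `charge`, and a `(px, py, pz, E)` tuple) and the function names are hypothetical choices made for this illustration; the thresholds are those quoted in the text.

```python
import math


def is_isolated(lepton_pt, cone_pt_sum, threshold=0.9):
    """Isolation ratio p_T(l) / (p_T(l) + p_T(cone)) > threshold, where
    cone_pt_sum is the scalar p_T sum of photons and hadrons within R = 0.4."""
    return lepton_pt / (lepton_pt + cone_pt_sum) > threshold


def invariant_mass(p, q):
    """Invariant mass of two four-vectors given as (px, py, pz, E) tuples."""
    px, py, pz, e = (p[i] + q[i] for i in range(4))
    return math.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))


def is_leptonic_z(l1, l2, window=(81.0, 101.0)):
    """Opposite-sign, same-flavor pair with mass inside the Z window (GeV)."""
    if l1["flavor"] != l2["flavor"]:
        return False
    if l1["charge"] * l2["charge"] >= 0:
        return False
    return window[0] <= invariant_mass(l1["p4"], l2["p4"]) <= window[1]
```

For example, two back-to-back muons of opposite charge with 45.6 GeV each reconstruct to a 91.2 GeV pair and would be merged into a leptonic Z candidate.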

We then cluster all calorimeter cells (and non-isolated leptons) into jets using the
Cambridge/Aachen algorithm with R = 1.5, as implemented in FastJet v2.4.2 [36]. These
quasi-hemispheric fat-jets serve as the input into our jet substructure algorithms. We only
consider events where the highest-$p_T$ fat-jet has $|\eta| < 1.5$. In events with no reconstructed
leptons (but possibly a leptonic Z), we decluster the hardest jet using the four-body hadronic
diboson-tagger and require that it pass all of our internal phase space cuts on the four-subjet
mass and subjet pairwise masses described in Section II A. All other jets we decluster into
two subjets with the BDRS mass-drop method [16]. We fractionally smear all subjet energies
by $5\,({\rm GeV}/p_T) \oplus 0.5\sqrt{{\rm GeV}/p_T} \oplus 0.04$.

12 In our previous paper [4], we found that the effects of pileup were easily controllable. For the present paper,
we have also studied the application of trimming [20]. We assume that most charged pileup particles can
be eliminated by vertexing, leaving over only photons and neutral hadrons. If we subsequently trim each
event by pre-clustering into R = 0.2 anti-$k_T$ jets [58] and throwing away jets with $p_T <$ a few GeV, then
we find that the influence of pileup on our reconstructions is largely eliminated.

We reconstruct $\slashed{E}_T$ by balancing the $p_T$ of all leptons and the subjets from the leading
two decomposed fat-jets.
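To make the subjet energy smearing concrete, here is a small sketch. It assumes the quoted resolution reads as a quadrature sum of three terms, $5\,{\rm GeV}/p_T \oplus 0.5\sqrt{{\rm GeV}/p_T} \oplus 0.04$, and that the smearing is Gaussian; both the reading of the formula and the Gaussian form are assumptions made here, as are the function names.

```python
import math
import random


def fractional_resolution(pt_gev):
    """Fractional energy resolution: noise, stochastic, and constant terms
    combined in quadrature (assumed reading of the quoted formula)."""
    noise = 5.0 / pt_gev
    stochastic = 0.5 / math.sqrt(pt_gev)
    constant = 0.04
    return math.sqrt(noise**2 + stochastic**2 + constant**2)


def smear_energy(energy_gev, pt_gev, rng):
    """Rescale an energy by a Gaussian factor of width fractional_resolution(pt)."""
    return energy_gev * (1.0 + rng.gauss(0.0, fractional_resolution(pt_gev)))


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the illustration is reproducible
    for pt in (50.0, 100.0, 500.0, 1000.0):
        print(pt, round(fractional_resolution(pt), 4), smear_energy(pt, pt, rng))
```

Under this reading the resolution is about 8% at $p_T$ = 100 GeV and approaches the 4% constant term for TeV-scale subjets.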

The final analysis channel of an event depends on a small set of baseline criteria applied
to the subjets, leptons, and $\slashed{E}_T$. In all cases with one lepton, we attempt to match it with the
hardest subjet-pair within $\Delta R < 1.5$ to form a semi-leptonic diboson candidate. We then
classify the event based on the activity on the opposite side of the detector ($\Delta\phi > \pi/2$ away
from the vector-summed subjets and lepton): a leptonic Z, a Z-candidate subjet-pair with
m = [76, 106] GeV, or substantial $\slashed{E}_T$. (A jet is only considered as a hadronic Z-candidate if
it is also the hardest jet in the event.) If the recoiling Z is visible, we apply the semileptonic
diboson-tag incorporating the lepton's sister neutrino, whereas if the Z is invisible we only
utilize the lepton and matched subjets. The full semileptonic tag criteria are described in
Section II B. In cases with two leptons, which serve as dileptonic diboson candidates, we
require that the leptons be within $\Delta R < 1.5$ of one another. Again, we determine the final
channel by checking whether the opposite side of the detector is consistent with leptonic
Z, hadronic Z, or invisible Z. Finally, if there is no lepton, then the hardest jet should
be a hadronic diboson candidate, and we only consider cases where it is associated with a
leptonic Z or $\slashed{E}_T$ comparable to the jet $p_T$. This defines our eight analysis channels.

All channels under consideration, except for hadronic $WW^{(*)}$ with leptonic Z, contain
neutrinos. In such cases, in order to facilitate approximate Higgs and Z' reconstruction, we
define a single effective neutrino by using the $\slashed{E}_T$ vector as the neutrino's $p_T$ vector. We then
set the neutrino's rapidity equal to that of the visible diboson decay products. In general,
the reconstructed Z' mass, $m_{Z'}^{\rm reco}$, is defined as the invariant mass of the four-vector sum
of the visible diboson decay products, visible Z decay products (if any), and the effective
neutrino.
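The effective-neutrino construction can be sketched directly from this definition. The tuple layout `(px, py, pz, E)` and the function names are illustrative assumptions; the neutrino is taken to be massless.

```python
import math


def rapidity(e, pz):
    return 0.5 * math.log((e + pz) / (e - pz))


def effective_neutrino(met_x, met_y, visible_diboson):
    """Massless neutrino whose transverse momentum is the missing-energy vector
    and whose rapidity equals that of the visible diboson decay products.

    visible_diboson: summed four-vector (px, py, pz, E) of the visible products.
    """
    _, _, pz_vis, e_vis = visible_diboson
    y = rapidity(e_vis, pz_vis)
    pt = math.hypot(met_x, met_y)
    return (met_x, met_y, pt * math.sinh(y), pt * math.cosh(y))


def reconstructed_mass(*four_vectors):
    """Invariant mass of the four-vector sum, e.g. visible diboson products,
    visible Z products (if any), and the effective neutrino."""
    px, py, pz, e = (sum(v[i] for v in four_vectors) for i in range(4))
    return math.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))
```

A reconstructed Z' mass would then be obtained as `reconstructed_mass(visible_diboson, visible_z, nu)`, dropping `visible_z` when the Z decays invisibly.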

We subsequently apply cuts to define a signal region, and determine the efficiency for
the signal and the cross section for each background. 13 The cuts are given in Tables VI
through VIII. We note that none of these cuts has been rigorously optimized in the sense of
a computer scan, but they have all simply been tuned by eye to reject as much background
as possible while keeping $O(1)$ signal efficiency.

13 When defining signal efficiencies, we do not allow cross-talk between the different decay channels and their
corresponding search channels. For example, decays involving taus can also produce (mini-)isolated leptons,
and can sometimes get picked up by one of our leptonic search channels. However, when determining
the efficiency for a given leptonic channel, we do not include signal events with taus.

Our final results appear in Tables IX through XVI. 14 We determine the signal efficiencies
and background cross sections for a trial set of Z' masses: 1, 2, and 3 TeV. (Many of the
backgrounds that we checked are subleading, in the sense that together they add up to only
$O(10\%)$ or less of the total. For brevity, we do not explicitly list their cross sections.) We
extend these results to arbitrary Z' masses by quadratic interpolation of the signal efficiencies
and of the logarithms of the individual background cross sections.
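The interpolation step can be illustrated with a three-point Lagrange polynomial through the 1, 2, and 3 TeV mass points. The helper names are hypothetical, and the sample numbers below are simply the 125 GeV signal efficiencies and $qg \to qZ$ background cross sections listed for the $(l^+l^-)(qq'qq')$ channel in Table IX.

```python
import math


def quadratic_interpolate(x, xs, ys):
    """Evaluate the unique quadratic through three points (Lagrange form)."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    l0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
    l1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
    l2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
    return y0 * l0 + y1 * l1 + y2 * l2


def interpolate_efficiency(mass_tev, masses, efficiencies):
    """Signal efficiency: interpolated directly."""
    return quadratic_interpolate(mass_tev, masses, efficiencies)


def interpolate_xsec(mass_tev, masses, xsecs_fb):
    """Background cross section: interpolate log(sigma), then exponentiate."""
    logs = [math.log(s) for s in xsecs_fb]
    return math.exp(quadratic_interpolate(mass_tev, masses, logs))


if __name__ == "__main__":
    masses = (1.0, 2.0, 3.0)        # Z' mass points in TeV
    eff = (0.21, 0.28, 0.31)        # signal efficiency, 125 GeV Higgs
    bkg = (2.0, 6.9e-2, 7.0e-3)     # qg -> qZ cross section after cuts, in fb
    print(interpolate_efficiency(1.5, masses, eff))
    print(interpolate_xsec(1.5, masses, bkg))
```

Interpolating the logarithm keeps the steeply falling background cross sections positive between mass points.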



14 To model the contribution of the W+jet background in the $(\nu\bar{\nu})(qq'qq')$ channel, we have simply rescaled
the Z+jet contribution by 1.5. This brings us into rough agreement with existing monojet searches
(e.g., [59]) without requiring us to explicitly model how well a leptonic W can be vetoed.
Fully hadronic $WW^{(*)}$ associated with | Leptonic Z | Invisible Z
$p_T$ cut | $p_T(h) > m_{Z'}^{\rm reco}/3$ |
Mass window | $m_{Z'} - 15\% < m_{Z'}^{\rm reco} < m_{Z'} + 15\%$ | $m_{Z'} - 30\% < m_{Z'}^{\rm reco} < m_{Z'} + 10\%$

TABLE VI: Kinematic cuts imposed for the analysis of $Z' \to Zh \to Z(qq'qq')$.



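Semileptonic $WW^{(*)}$ associated with | Leptonic Z | Hadronic Z | Invisible Z
$p_T$ cuts (leptonic or hadronic Z) | $p_T(j_W j_W) > 0.1 \times p_T(Z)$; $p_T(Z) > m_{Z'}^{\rm reco}/3$
$p_T$ cuts (invisible Z) | $p_T(j_W j_W) > 0.3 \times \slashed{E}_T$; $p_T(j_2) < 0.3 \times p_T(j_W j_W)$
Mass requirement | $m_{Z'} - 15\% < m_{Z'}^{\rm reco} < m_{Z'} + 15\%$ | $m_{Z'} - 10\% < m_{Z'}^{\rm reco} < m_{Z'} + 10\%$ | $m_{Z'}^{\rm reco} > m_{Z'}/2$

TABLE VII: Kinematic cuts imposed for the analysis of $Z' \to Zh \to Z(l\nu qq')$. $j_2$ represents the
sum of subjets from the second-hardest fat-jet.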



Dileptonic $WW^{(*)}$ associated with | Leptonic Z or Hadronic Z | Invisible Z
Diboson mass, h(125) | $m(l^+l^-\nu) < 135$ | $10 < m(l^+l^-) < 75$
Diboson mass, h(200) | $m(l^+l^-\nu) < 210$ | $10 < m(l^+l^-) < 150$
$p_T$ cut | $p_T(Z) > m_{Z'}^{\rm reco}/3$ | $p_T(j_1) < 0.3 \times \slashed{E}_T$
Mass requirement | $m_{Z'} - 10\% < m_{Z'}^{\rm reco} < m_{Z'} + 10\%$ | $m_{Z'}^{\rm reco} > 0.3 \times m_{Z'}$

TABLE VIII: Kinematic cuts imposed for the analysis of $Z' \to Zh \to Z(l\nu l\nu)$. (All units are in
GeV.) $j_1$ represents the sum of subjets from the hardest fat-jet.





$m_{Z'}$ | 1 TeV | 1 TeV | 2 TeV | 2 TeV | 3 TeV | 3 TeV
$m_h$ | 125 GeV | 200 GeV | 125 GeV | 200 GeV | 125 GeV | 200 GeV
Signal Eff. | 21% | 17% | 28% | 30% | 31% | 32%
$qg \to qZ$ | 2.0 | 0.54 | $6.9 \times 10^{-2}$ | $3.3 \times 10^{-2}$ | $7.0 \times 10^{-3}$ | $1.8 \times 10^{-3}$
$q\bar{q} \to gZ$ | 0.66 | 0.33 | $3.6 \times 10^{-2}$ | $2.4 \times 10^{-2}$ | $4.4 \times 10^{-3}$ | $1.7 \times 10^{-3}$
subleading | Ztb, $Zt\bar{t}$, diboson, triboson, continuum Zh

TABLE IX: Signal efficiency and background cross sections (in fb) after all cuts in the $Zh \to
(l^+l^-)(qq'qq')$ channel. Signal efficiency does not incorporate the channel's total BR of 0.0066
(0.022) for 125 GeV (200 GeV) Higgs.
$m_{Z'}$ | 1 TeV | 1 TeV | 2 TeV | 2 TeV | 3 TeV | 3 TeV
$m_h$ | 125 GeV | 200 GeV | 125 GeV | 200 GeV | 125 GeV | 200 GeV
Signal Eff. | 28% | 24% | 37% | 39% | 41% | 43%
$qg \to q(Z/W)$ | 27 | 16 | 1.0 | 0.79 | 0.13 | $4.7 \times 10^{-2}$
$q\bar{q} \to g(Z/W)$ | 11 | 5.0 | 0.50 | 0.25 | $5.1 \times 10^{-2}$ | $3.8 \times 10^{-2}$
subleading | Ztb, $Zt\bar{t}$, single-top, diboson, triboson, continuum Zh

TABLE X: Signal efficiency and background cross sections (in fb) after all cuts in the $Zh \to
(\nu\bar{\nu})(qq'qq')$ channel. Signal efficiency does not incorporate the channel's total BR of 0.020 (0.067)
for 125 GeV (200 GeV) Higgs.





$m_{Z'}$ | 1 TeV | 1 TeV | 2 TeV | 2 TeV | 3 TeV | 3 TeV
$m_h$ | 125 GeV | 200 GeV | 125 GeV | 200 GeV | 125 GeV | 200 GeV
Signal Eff. | 21% | 21% | 28% | 33% | 28% | 36%
ZWj | $1.3 \times 10^{-2}$ | $1.3 \times 10^{-2}$ | $5.7 \times 10^{-4}$ | $4.2 \times 10^{-4}$ | $3.9 \times 10^{-5}$ | $3.9 \times 10^{-5}$
$Zt\bar{t}$ | $4.2 \times 10^{-3}$ | $6.1 \times 10^{-3}$ | $9.7 \times 10^{-5}$ | $1.6 \times 10^{-4}$ | $7.0 \times 10^{-6}$ | $1.2 \times 10^{-5}$
triboson | $6.8 \times 10^{-4}$ | $1.7 \times 10^{-3}$ | $5.2 \times 10^{-5}$ | $1.8 \times 10^{-4}$ | $5.7 \times 10^{-6}$ | $2.4 \times 10^{-5}$
subleading | Zj, Ztb, continuum Zh

TABLE XI: Signal efficiency and background cross sections (in fb) after all cuts in the $Zh \to
(l^+l^-)(l\nu qq')$ channel. Signal efficiency does not incorporate the channel's total BR of 0.0044
(0.015) for 125 GeV (200 GeV) Higgs.





$m_{Z'}$ | 1 TeV | 1 TeV | 2 TeV | 2 TeV | 3 TeV | 3 TeV
$m_h$ | 125 GeV | 200 GeV | 125 GeV | 200 GeV | 125 GeV | 200 GeV
Signal Eff. | 18% | 20% | 25% | 32% | 25% | 34%
$l\nu j$ | 1.3 | 1.1 | $2.1 \times 10^{-2}$ | $2.5 \times 10^{-2}$ | $8.5 \times 10^{-4}$ | $9.5 \times 10^{-4}$
$t\bar{t}$ | 0.81 | 0.48 | $1.0 \times 10^{-2}$ | $4.5 \times 10^{-3}$ | $1.1 \times 10^{-3}$ | $6.2 \times 10^{-4}$
ZWj | 0.15 | 0.14 | $5.4 \times 10^{-3}$ | $3.6 \times 10^{-3}$ | $2.7 \times 10^{-4}$ | $7.3 \times 10^{-4}$
subleading | Zj, diboson, triboson, single-top, Ztb, $Zt\bar{t}$, continuum Zh

TABLE XII: Signal efficiency and background cross sections (in fb) after all cuts in the $Zh \to
(\nu\bar{\nu})(l\nu qq')$ channel. Signal efficiency does not incorporate the channel's total BR of 0.013 (0.044)
for 125 GeV (200 GeV) Higgs.
$m_{Z'}$ | 1 TeV | 1 TeV | 2 TeV | 2 TeV | 3 TeV | 3 TeV
$m_h$ | 125 GeV | 200 GeV | 125 GeV | 200 GeV | 125 GeV | 200 GeV
Signal Eff. | 14% | 13% | 19% | 23% | 17% | 21%
$t\bar{t}$ | 5.1 | 4.2 | $7.2 \times 10^{-2}$ | $4.7 \times 10^{-2}$ | $4.9 \times 10^{-3}$ | $3.4 \times 10^{-3}$
Wjj | 4.2 | 2.0 | $5.8 \times 10^{-2}$ | $5.4 \times 10^{-2}$ | $1.1 \times 10^{-2}$ | $4.8 \times 10^{-3}$
subleading | QCD dijets, (W/Z)j, diboson+jet, triboson, single-top, continuum Zh

TABLE XIII: Signal efficiency and background cross sections (in fb) after all cuts in the $Zh \to
(q\bar{q})(l\nu qq')$ channel. Signal efficiency does not incorporate the channel's total BR of 0.046 (0.16)
for 125 GeV (200 GeV) Higgs.





                          m_Z' = 1 TeV          m_Z' = 2 TeV          m_Z' = 3 TeV
                          125 GeV    200 GeV    125 GeV    200 GeV    125 GeV    200 GeV
Signal Eff.               35%        32%        46%        42%        48%        41%
l⁺l⁻j                     0.44       0.71       1.4×10⁻²   1.9×10⁻²   1.0×10⁻³   1.4×10⁻³
τ⁺τ⁻j                     0.33       0.35       1.1×10⁻²   1.1×10⁻²   7.4×10⁻⁴   7.8×10⁻⁴
tt̄j                       0.24       0.66       6.6×10⁻³   2.0×10⁻²   3.9×10⁻⁴   1.1×10⁻³
Zl⁺l⁻ → jj l⁺l⁻           5.9×10⁻²   8.4×10⁻²   2.4×10⁻³   3.0×10⁻³   1.8×10⁻⁴   2.7×10⁻⁴
W⁺W⁻j                     2.8×10⁻²   7.1×10⁻²   1.5×10⁻³   3.4×10⁻³   1.1×10⁻⁴   3.0×10⁻⁴
subleading                triboson, continuum Zh

TABLE XIV: Signal efficiency and background cross sections (in fb) after all cuts in the
$Zh \to (q\bar q)(l\nu l\nu)$ channel. Signal efficiency does not incorporate the channel's
total BR of 0.0075 (0.025) for 125 GeV (200 GeV) Higgs.





                          m_Z' = 1 TeV          m_Z' = 2 TeV          m_Z' = 3 TeV
                          125 GeV    200 GeV    125 GeV    200 GeV    125 GeV    200 GeV
Signal Eff.               55%        50%        60%        59%        63%        56%
Zl⁺l⁻ → 4l incl. τ's      2.6×10⁻²   4.0×10⁻²   1.5×10⁻³   2.0×10⁻³   1.5×10⁻⁴   2.0×10⁻⁴
subleading                Ztt̄, triboson, continuum Zh

TABLE XV: Signal efficiency and background cross sections (in fb) after all cuts in the
$Zh \to (l^+l^-)(l\nu l\nu)$ channel. Signal efficiency does not incorporate the channel's
total BR of 0.00070 (0.0024) for 125 GeV (200 GeV) Higgs.
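
As a worked illustration of how the entries of these tables combine, the short sketch below
takes the first column of Table XV ($m_{Z'} = 1$ TeV, 125 GeV Higgs) and computes the product
of signal efficiency and channel BR, together with the number of leading-background events
for the 100 fb$^{-1}$ luminosity quoted in the abstract. The function names are illustrative
only, and the subleading backgrounds, which are not tabulated, are ignored.

```python
# Assumed integrated luminosity in fb^-1 (the value quoted in the abstract).
LUMINOSITY_FB = 100.0

def effective_acceptance(signal_eff, branching_ratio):
    """Fraction of all Z' -> Zh events that land in this channel and pass the cuts."""
    return signal_eff * branching_ratio

def expected_events(cross_section_fb, luminosity_fb=LUMINOSITY_FB):
    """Expected event count for a cross section given in fb."""
    return cross_section_fb * luminosity_fb

# Table XV, m_Z' = 1 TeV, 125 GeV Higgs: efficiency 55%, BR 0.00070,
# leading background (Z l+ l- -> 4l incl. taus) 2.6e-2 fb.
acceptance = effective_acceptance(0.55, 0.00070)
background = expected_events(2.6e-2)
print(f"efficiency x BR = {acceptance:.2e}")
print(f"leading-background events = {background:.1f}")
```

Multiplying the acceptance by a model's $Z' \to Zh$ production cross section and the same
luminosity would give the corresponding signal yield.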
                          m_Z' = 1 TeV          m_Z' = 2 TeV          m_Z' = 3 TeV
                          125 GeV    200 GeV    125 GeV    200 GeV    125 GeV    200 GeV
Signal Eff.               45%        43%        51%        48%        54%        52%
(νν̄)(l⁺l⁻) incl. τ's      0.22       0.22       2.6×10⁻²   2.9×10⁻²   6.1×10⁻³   6.3×10⁻³
(lν)(l⁺l⁻) incl. τ's      0.19       0.38       1.0×10⁻²   2.3×10⁻²   1.7×10⁻³   3.4×10⁻³
tt̄                        0.48       0.73       < 8×10⁻⁴   < 8×10⁻⁴   < 1×10⁻⁴   < 1×10⁻⁴
subleading                (lν)(lν), triboson, continuum Zh

TABLE XVI: Signal efficiency and background cross sections (in fb) after all cuts in the
$Zh \to (\nu\bar\nu)(l\nu l\nu)$ channel. Signal efficiency does not incorporate the channel's
total BR of 0.0021 (0.0072) for 125 GeV (200 GeV) Higgs.
FIG. 10: Distributions of the reconstructed hadronic Z mass for 3 TeV
$Z' \to Zh \to (q\bar q)(l\nu l\nu)$ (left) and quark-jets with $p_T \simeq 1500$ GeV (right)
processed through the BDRS mass-drop procedure. Displayed are particle-level (black), simple
ECAL and HCAL cells (red), ECAL cells rescaled per containing HCAL cell (pink), and ECAL cells
rescaled per containing HCAL cluster (blue).



Appendix B: Detector Model 

In a realistic detector, substructure-sensitive observables are affected by finite energy 
resolution and finite spatial resolution. Because the jets and subjets that we consider are 
very energetic, and calorimeter energy measurements become better with increasing energy 
(down to few-percent resolution), we do not expect that energy resolution effects will be a 
major obstacle. Spatial resolution, on the other hand, may pose a very serious fundamental 
limitation. For example, in our most energetic Z-jets and diboson-jets, individual subjets 
may lie inside of a single hadronic calorimeter cell. 

The ATLAS and CMS detectors consist of multiple subdetectors, of which the HCAL is
the spatially coarsest. By combining tracker, ECAL, and HCAL information together, it is
possible to extract a very detailed picture of the energy flow of the event [60]. However, it
is known that tracking becomes less reliable as jet energies increase, due to the increasing
density of hits and the decreasing track curvatures. This may not be a fatal issue, but to be
conservative we consider only the ECAL and HCAL.



FIG. 11: Distributions of the reconstructed fully hadronic 125 GeV Higgs mass for 3 TeV
$Z' \to Zh \to (\nu\bar\nu)(q\bar q' q\bar q')$ (left) and quark-jets with $p_T \simeq 1500$ GeV
(right) processed through our 4-body declustering procedure. Displayed are particle-level
(black), simple ECAL and HCAL cells (red), ECAL cells rescaled per containing HCAL cell (pink),
and ECAL cells rescaled per containing HCAL cluster (blue).

In a previous paper [4], we suggested a simple method to hybridize the ECAL's 4-5 times
better spatial resolution with the full ECAL+HCAL energy measurement: within each
HCAL cell and its associated block of ECAL cells, rescale the ECAL cell energies so that
the sum of ECAL cells matches the full ECAL+HCAL energy. These rescaled ECAL cells
then serve as the 4-vectors input into jet clustering and jet substructure. We found that
this technique was very effective at correcting for the mass-distorting effects of the HCAL
geometry, at least in the context of a toy calorimeter model with perfect energy-sampling
cells.
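
The cell-by-cell rescaling can be sketched in a few lines of NumPy. This is only a minimal
illustration under stated assumptions, not the code used for the study: the calorimeter is
taken to be a flat grid in which each HCAL cell covers a 5×5 block of ECAL cells (0.1/0.02
across), the function name `rescale_ecal` is hypothetical, and the treatment of blocks with
no ECAL energy (spreading the HCAL energy uniformly over the block) is our own choice, since
the text does not specify it.

```python
import numpy as np

N = 5  # ECAL cells per HCAL cell along each axis (0.1 / 0.02)

def rescale_ecal(ecal, hcal):
    """Rescale ECAL cells so each 5x5 block sums to its full ECAL+HCAL energy.

    ecal: array of shape (5*n_eta, 5*n_phi) holding ECAL cell energies
    hcal: array of shape (n_eta, n_phi) holding HCAL cell energies
    """
    n_eta, n_phi = hcal.shape
    # View the ECAL grid as (HCAL eta index, sub-eta, HCAL phi index, sub-phi).
    blocks = ecal.reshape(n_eta, N, n_phi, N)
    ecal_sum = blocks.sum(axis=(1, 3))        # ECAL energy behind each HCAL cell
    total = ecal_sum + hcal                   # full ECAL+HCAL energy per block
    has_ecal = ecal_sum > 0
    scale = np.where(has_ecal, total / np.where(has_ecal, ecal_sum, 1.0), 0.0)
    out = blocks * scale[:, None, :, None]
    # Assumption (not from the text): blocks without any ECAL energy get the
    # HCAL energy spread uniformly over their 25 ECAL cells.
    out = out + (np.where(has_ecal, 0.0, hcal) / N**2)[:, None, :, None]
    return out.reshape(ecal.shape)

# Tiny example: a 2x2 HCAL grid with two ECAL hits in the first block.
ecal = np.zeros((10, 10))
ecal[0, 0], ecal[1, 2] = 3.0, 1.0
hcal = np.array([[4.0, 2.0], [0.0, 0.0]])
rescaled = rescale_ecal(ecal, hcal)
```

The rescaled ECAL cells keep the angular pattern seen in the ECAL while carrying the total
energy of their block, which is the property that the mass reconstruction relies on.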

Here, we add some important extra layers of reality to our toy calorimeter by approximately
accounting for two additional effects: (1) the energy deposited by each particle will
shower into several cells, and (2) hadrons will often deposit energy in the ECAL due to
nuclear interactions. The two effects partially cancel each other, as the former introduces 
spurious substructure, whereas the latter increases the share of jet energy collected in the 
more finely-segmented ECAL. 

To implement the spatial energy smearing, we continuously distribute the energy of each
particle using the profile parametrization in [61], setting the Molière radius equal to one cell
unit (individually for ECAL and HCAL). (See [62] for a similar model.) For simplicity, we
apply the smearing in $\eta$-$\phi$ space rather than in real space, which should be an adequate
approximation except at high $\eta$. ECAL and HCAL cells are respectively set to 0.02 and
0.1 units across. Our spatial smearing is specifically designed to furnish a good model of the
CMS ECAL, and is likely somewhat pessimistic for the HCAL of either experiment.

To implement the hadronic ECAL deposits, we apply a 65% probability for charged
hadrons above an energy of 5 GeV to deposit some energy. The fraction of energy is taken
from a simple linearly-falling distribution that hits zero probability at 100% deposition.
These provide a coarse model of the CMS ECAL response to charged hadrons as observed
in test beam data [64]. Including this effect increases the typical EM fraction of a jet
from about 25% (mainly $\pi^0 \to \gamma\gamma$) to 40%. In either case, the RMS of the
distribution is comparable to half of the mean. More realistic jet EM fractions in the complete
detector would be 55% for CMS [65] and 80% for ATLAS [66], with similar relative RMS.
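
The hadronic ECAL deposit model described above can be sketched as follows. This is a hedged
illustration, not the authors' implementation: the function name is hypothetical, and we
assume that "linearly-falling" means a probability density $p(f) = 2(1 - f)$ on $0 \le f \le 1$,
which vanishes at 100% deposition. Its cumulative distribution is $1 - (1 - f)^2$, so a uniform
random number $u$ maps to $f = 1 - \sqrt{1 - u}$.

```python
import numpy as np

P_DEPOSIT = 0.65   # probability for a charged hadron to deposit in the ECAL (from the text)
E_MIN = 5.0        # energy threshold in GeV (from the text)

def hadronic_ecal_fraction(energy, is_charged_hadron, rng):
    """Fraction of a charged hadron's energy deposited in the ECAL (0 if none)."""
    if not is_charged_hadron or energy <= E_MIN:
        return 0.0
    if rng.random() >= P_DEPOSIT:
        return 0.0
    # Inverse-transform sampling of the assumed density p(f) = 2(1 - f).
    return float(1.0 - np.sqrt(1.0 - rng.random()))

rng = np.random.default_rng(0)
fractions = [hadronic_ecal_fraction(50.0, True, rng) for _ in range(50000)]
mean_fraction = sum(fractions) / len(fractions)
```

Under this assumed density the mean deposited fraction is 1/3 for hadrons that do deposit,
so the average over all charged hadrons above threshold is about $0.65/3 \approx 0.22$.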



We can see the effects of our calorimeter model on $p_T \simeq 1500$ GeV Z-jet and Higgs-jet
mass reconstructions in Figs. 10 and 11, respectively. (Examples of diboson-jet event
displays appear in Fig. 1.) If we simply add up the ECAL and HCAL cell 4-vectors, there
is an obvious degradation of the signal peaks. Perhaps more worrisome, the $p_T$ scale and
spatial resolution scale interplay to create an artificial mass feature in the background QCD
jets at about 100 GeV. Applying our naive ECAL rescaling brings us closer to particle-level,
but there is still an obvious bias. To achieve a better correction, we can apply a
slight modification to the ECAL rescaling. We first "undo" the spatial smearing effects by
clustering HCAL cells with the anti-$k_T$ algorithm [58] with $R = 0.17$ (smaller than 2 cells but
bigger than $\sqrt{2}$ cells). We then rescale all ECAL cells associated with each cluster of HCAL
cells, rather than cell-by-cell. The result of this procedure also appears in Figs. 10 and 11,
and clearly indicates that most of the remaining bias has been removed. Not only is the
signal peak approximately restored, but the artificial intrinsic mass scale of the background
jets has been moved down to $\mathcal{O}(20\ \mathrm{GeV})$.
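
The choice of clustering radius can be checked with a few lines of arithmetic. The sketch
below only tests the geometric statement that 0.17 lies between $\sqrt{2}$ and 2 HCAL cell
widths, so that cell centres which are edge- or corner-adjacent fall within the radius of one
another while centres two cells apart do not. It does not run the anti-$k_T$ algorithm itself,
in which cluster axes also move as cells are merged, and the helper name is hypothetical.

```python
import math

HCAL_SIZE = 0.1   # HCAL cell width in eta-phi (from the text)
R = 0.17          # clustering radius (from the text)

def within_radius(d_eta_cells, d_phi_cells, radius=R, cell=HCAL_SIZE):
    """True if two cell centres separated by the given cell offsets lie within radius."""
    return math.hypot(d_eta_cells * cell, d_phi_cells * cell) < radius

for offset in [(1, 0), (1, 1), (2, 0), (2, 1)]:
    print(offset, within_radius(*offset))
```

Only the first two offsets, the edge-adjacent and corner-adjacent neighbours, pass the test.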

The quality of the final mass reconstruction may seem somewhat surprising, given that
the fraction of energy deposited in the ECAL fluctuates a lot. The mass of a quasi-collinear
subjet pair goes like $p_T \, \Delta R \, \sqrt{z(1-z)}$, where $p_T$ is the total transverse
momentum of the pair, $\Delta R$ is their $\eta$-$\phi$ distance, and $z \in [0, 0.5]$ parametrizes
their energy sharing. With our procedure, $p_T$ and $\Delta R$ are very well-measured, and the
potential problem is in the $z$ measurement. For subjets with equal energy sharing, the
$\sqrt{z(1-z)}$ factor is 0.5. Even if we mismeasured the energy sharing by a factor of 2, so
that $z \to 0.25$, the result shifts by only 14%. More asymmetric splittings suffer more:
$z = 0.1$ mismeasured as $z = 0.05$ changes the mass by 27%. However, averaging over many
subjet configurations and EM fractions, the net mass smearing effect is rather modest.
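
The quoted shifts follow directly from the $\sqrt{z(1-z)}$ factor, as the short check below
illustrates. It holds $p_T$ and $\Delta R$ fixed and measures the shift relative to the true
mass; the function names are illustrative. With these conventions the two cases come out at
roughly 13% and 27%, close to the numbers quoted in the text.

```python
import math

def sharing_factor(z):
    """The sqrt(z(1-z)) factor in the quasi-collinear pair mass."""
    return math.sqrt(z * (1.0 - z))

def relative_mass_shift(z_true, z_measured):
    """Fractional decrease of the mass when z_true is measured as z_measured."""
    return 1.0 - sharing_factor(z_measured) / sharing_factor(z_true)

symmetric_shift = relative_mass_shift(0.5, 0.25)    # equal sharing, z halved
asymmetric_shift = relative_mass_shift(0.1, 0.05)   # asymmetric sharing, z halved
print(f"{symmetric_shift:.3f} {asymmetric_shift:.3f}")
```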

There are many other aspects of the real calorimeters that we did not consider, such as 
shower fluctuations, noise, showering in the tracker, and magnetic field effects. We do not 
expect that these will significantly alter our conclusions, but they would certainly need to 
be accounted for to verify that our cluster-rescaling trick works as advertised. We also note 
that this type of procedure should be useful for other jet substructure applications that 
require fine angular resolution, such as high-$p_T$ top-tagging.